About the document-term matrix

Document-term matrix: English Wikipedia

Document-term matrix

A document-term matrix is a mathematical matrix that describes the frequency of terms that occur in a collection of documents. In a document-term matrix, rows correspond to documents in the collection and columns correspond to terms; a term-document matrix is its transpose, with rows for terms and columns for documents. There are various schemes for determining the value that each entry in the matrix should take. One such scheme is tf-idf. Document-term matrices are useful in the field of natural language processing.
==General Concept==
When creating a database of terms that appear in a set of documents, the document-term matrix contains rows corresponding to the documents and columns corresponding to the terms. For instance, if one has the following two (short) documents:
*D1 = "I like databases"
*D2 = "I hate databases",
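then the document-term matrix would be:
{| class="wikitable"
! !! I !! like !! hate !! databases
|-
! D1
| 1 || 1 || 0 || 1
|-
! D2
| 1 || 0 || 1 || 1
|}
which shows which documents contain which terms and how many times they appear.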
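The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `build_document_term_matrix` is hypothetical, tokenization is assumed to be a plain whitespace split, and columns are ordered by first appearance of each term, so the column order may differ from other presentations of the same matrix.

```python
from collections import Counter


def build_document_term_matrix(documents):
    """Return (vocabulary, matrix) holding raw term counts.

    Assumption for this sketch: tokens are separated by whitespace and
    are case-sensitive. Real pipelines usually normalize text first.
    """
    tokenized = [doc.split() for doc in documents]

    # Collect the vocabulary in order of first appearance.
    vocabulary = []
    seen = set()
    for tokens in tokenized:
        for term in tokens:
            if term not in seen:
                seen.add(term)
                vocabulary.append(term)

    # One row per document, one column per term.
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


documents = ["I like databases", "I hate databases"]
vocabulary, matrix = build_document_term_matrix(documents)
print(vocabulary)  # ['I', 'like', 'databases', 'hate']
print(matrix)      # [[1, 1, 1, 0], [1, 0, 1, 1]]
```

Each row is one document (D1, then D2) and each entry is the number of times the corresponding term occurs in that document.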
Note that more sophisticated weights can be used; one typical example, among others, would be tf-idf.
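As a rough illustration of such a weighting, the sketch below replaces raw counts with tf-idf values. Several tf-idf variants exist; this one assumes tf is the raw count and idf is the natural logarithm of (number of documents / number of documents containing the term), with no smoothing. The function name `tf_idf_matrix` and the whitespace tokenization are assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf_matrix(documents):
    """Return (vocabulary, matrix) of tf-idf weights.

    Assumed variant: tf = raw count, idf = ln(N / df), no smoothing.
    """
    tokenized = [doc.split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / doc_freq[term]) for term in vocabulary}

    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[term] * idf[term] for term in vocabulary])
    return vocabulary, matrix


tfidf_vocabulary, weights = tf_idf_matrix(["I like databases", "I hate databases"])
print(tfidf_vocabulary)  # ['I', 'databases', 'hate', 'like']
print(weights)
```

Under this variant, terms that occur in every document ("I" and "databases" here) receive weight 0, while "like" and "hate" each receive ln 2 in the one document that contains them.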

Excerpt source: the free encyclopedia Wikipedia